# Frozen Pretrained Models
Image Captioning
MIT
BLIP-2 is a vision-language model that combines an image encoder with a large language model for image-to-text generation tasks.
Image-to-Text
Transformers English

getZuma
25
0
Blip2 Opt 2.7b Coco
MIT
BLIP-2 is a vision-language model that bootstraps language-image pretraining from a frozen image encoder and a frozen large language model.
Image-to-Text
Transformers English

Salesforce
3,900
9
Blip2 Opt 2.7b
MIT
BLIP-2 is a vision-language model that combines an image encoder with a large language model for image-to-text generation tasks.
Image-to-Text
Transformers English

Salesforce
867.78k
359